52 research outputs found
An Efficient Cell List Implementation for Monte Carlo Simulation on GPUs
Maximizing the performance potential of the modern day GPU architecture
requires judicious utilization of available parallel resources. Although
dramatic reductions in execution time can often be obtained through straightforward mappings,
further performance improvements often require algorithmic redesigns to more
closely exploit the target architecture. In this paper, we focus on efficient
molecular simulations for the GPU and propose a novel cell list algorithm that
better utilizes its parallel resources. Our goal is an efficient GPU
implementation of large-scale Monte Carlo simulations for the grand canonical
ensemble. This is a particularly challenging application because there is
inherently less computation and parallelism than in similar applications with
molecular dynamics. Consistent with the results of prior researchers, our
simulation results show traditional cell list implementations for Monte Carlo
simulations of molecular systems offer effectively no performance improvement
for small systems [5, 14], even when porting to the GPU. However, for larger
systems, the cell list implementation offers significant gains in performance.
Furthermore, our novel cell list approach results in better performance for all
problem sizes when compared with other GPU implementations with or without cell
lists.
Comment: 30 pages
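For readers unfamiliar with the term, the following is a minimal CPU sketch of a traditional cell list, the baseline the abstract compares against. It is not the paper's novel GPU algorithm. It assumes a cubic periodic box and a spherical interaction cutoff, and the function names (`build_cell_list`, `neighbors_within_cutoff`) are hypothetical, chosen only for illustration.

```python
from collections import defaultdict
from itertools import product


def build_cell_list(positions, box, cutoff):
    """Bin particle indices into cubic cells whose edge is at least `cutoff`.

    Assumes a cubic periodic box of edge `box` (an illustrative assumption).
    """
    n_cells = max(1, int(box // cutoff))  # cells per dimension
    cell_size = box / n_cells
    cells = defaultdict(list)
    for i, pos in enumerate(positions):
        key = tuple(int(c / cell_size) % n_cells for c in pos)
        cells[key].append(i)
    return cells, n_cells, cell_size


def neighbors_within_cutoff(i, positions, cells, n_cells, cell_size, box, cutoff):
    """Return sorted indices of particles within `cutoff` of particle i.

    Only the particle's own cell and its 26 adjacent cells are searched,
    instead of every particle in the system.
    """
    home = tuple(int(c / cell_size) % n_cells for c in positions[i])
    # A set removes duplicate cells when the grid has fewer than 3 cells per side.
    keys = {
        tuple((h + d) % n_cells for h, d in zip(home, offset))
        for offset in product((-1, 0, 1), repeat=3)
    }
    found = []
    for key in keys:
        for j in cells.get(key, ()):
            if j == i:
                continue
            d2 = 0.0
            for a, b in zip(positions[i], positions[j]):
                d = a - b
                d -= box * round(d / box)  # minimum-image convention
                d2 += d * d
            if d2 < cutoff * cutoff:
                found.append(j)
    return sorted(found)


positions = [(0.5, 0.5, 0.5), (1.0, 0.5, 0.5), (9.8, 0.5, 0.5), (5.0, 5.0, 5.0)]
cells, n_cells, cell_size = build_cell_list(positions, box=10.0, cutoff=2.5)
near_first = neighbors_within_cutoff(0, positions, cells, n_cells, cell_size, 10.0, 2.5)
```

In this sketch, the cost of evaluating a single trial move depends on the local density rather than on the total particle count. For small systems the 27-cell search covers most of the box anyway, which is in line with the abstract's remark that traditional cell lists offer little benefit there.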
Improved fault recovery for core based trees
The demand for multicast communication in wide-area networks, such as the Internet, is increasing. Core based trees is one protocol that has been proposed to support scalable multicasting for sparse groups. When faults occur in the network nodes or links of the tree, the tree can become disconnected. In this paper, we propose an efficient protocol for recovering from faults in a core based tree. One of the key ideas is a technique for restructuring the disconnected subtree so that a loop-free path to the core can be found. The correctness of this protocol is also proved.
The Impact of Output Selection Function Choice on the Performance of Adaptive Wormhole Routing
Many adaptive routing algorithms have been proposed for wormhole-routed interconnection networks. Comparatively little work, however, has been done on determining how the output selection function (routing policy) affects the performance of an adaptive routing algorithm. In this paper, we present a detailed simulation study of various selection functions for a fully adaptive mesh routing algorithm. The simulation results show that the choice of selection function has a significant effect on the average message latency. Thus, a naive implementation of an adaptive routing algorithm may lead to poor performance. These selection functions are also compared with a theoretically optimal selection function [1]. We show that, although it is theoretically optimal, this selection function does not deliver the best performance in practice. An explanation and interpretation of the results are provided.